
Ensemble learning approach to classifying documents based on formal and informal writing styles

By: Karunarathna, K M G.
Contributor(s): Rupasingha, R A H M.
Publisher: Hyderabad IUP Publications 2022
Edition: Vol. 18(3), Sep.
Description: 27-49 p.
Subject(s): EXTC Engineering
In: IUP Journal of Information Technology
Summary: With recent advances in technology, many students and scholars have been tempted to use the internet as their main educational resource, since they can obtain a variety of documents online. These documents can be classified as either formal or informal in writing style, involving different linguistic features. The paper presents a method to automatically identify the style of a particular document. First, a dataset of online documents was compiled and preprocessed. Next, features were extracted via a term frequency-inverse document frequency (TF-IDF) vectorizer. Classification models were then built using six classification algorithms. Initially, five machine learning algorithms (random forest, decision tree, support vector machine, multilayer perceptron, and naive Bayes) were used. Of these five algorithms, the random forest algorithm performed best, obtaining an accuracy of 87.44%, high values for precision, recall, and F-measure, and the lowest error rate. In the second experiment, an ensemble learning method was used, whereby a vote algorithm was applied to a combination of the five algorithms. This method obtained an accuracy of 91.96%. The method combines several algorithms.
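The two techniques named in the summary, TF-IDF weighting and a vote over several classifiers, can be illustrated with a minimal sketch. This is not the authors' implementation: the summary does not state the exact TF-IDF variant or voting rule, so the code assumes a textbook weighting (term frequency times log(N / document frequency)) and hard majority voting. The function names `tf_idf` and `majority_vote`, the toy documents, and the five example votes are all hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


def majority_vote(predictions):
    """Hard voting: return the label chosen by the most base classifiers."""
    return Counter(predictions).most_common(1)[0][0]


# Toy documents invented for illustration (not the paper's dataset).
docs = [
    "we hereby request your approval",
    "hey wanna grab lunch",
    "we request your reply",
]
weights = tf_idf(docs)

# Hypothetical outputs of five base classifiers for a single document.
votes = ["formal", "informal", "formal", "formal", "informal"]
label = majority_vote(votes)
print(label)
```

In this sketch a term that appears in every document receives a weight of zero, which is why TF-IDF tends to emphasize vocabulary that distinguishes one style from another. With an odd number of base classifiers and two classes, a hard vote can never tie.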
Item type: Articles Abstract Database
Current location: School of Engineering & Technology, Archival Section
Status: Not for loan
Barcode: 2023-0123
Total holds: 0

